OCR exploration server

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Machine Recognition of Printed Kannada Text

Internal identifier: 001884 (Main/Exploration); previous: 001883; next: 001885

Authors: B. Vijay Kumar [India]; G. Ramakrishnan [India]

RBID : ISTEX:3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8

Abstract

This paper presents the design of a full-fledged OCR system for printed Kannada text. The machine recognition of Kannada characters is difficult due to similarity in the shapes of different characters, script complexity and non-uniqueness in the representation of diacritics. The document image is subjected to line segmentation, word segmentation and zone detection. From the zonal information, base characters, vowel modifiers and consonant conjuncts are separated. A knowledge-based approach is employed for recognizing the base characters. Various features are employed for recognizing the characters. These include the coefficients of the Discrete Cosine Transform, Discrete Wavelet Transform and Karhunen-Loève Transform. These features are fed to different classifiers. Structural features are used in the subsequent levels to discriminate confused characters. The use of structural features increases the recognition rate from 93% to 98%. Apart from the classical pattern classification technique of nearest neighbour, Artificial Neural Network (ANN) based classifiers such as Backpropagation and Radial Basis Function (RBF) networks have also been studied. The ANN classifiers are trained in supervised mode using the transform features. The highest recognition rate of 99% is obtained with RBF using second-level approximation coefficients of Haar wavelets as the features on presegmented base characters.
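
To make the feature-extraction idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It computes second-level Haar approximation coefficients of a small binary glyph and classifies it with a nearest-neighbour rule, one of the classifiers the abstract mentions. The 8x8 toy glyphs, the labels and all function names are assumptions made for illustration; the paper's image sizes, normalization and RBF network are not reproduced here.

```python
def haar_approx(img):
    """One level of the 2-D orthonormal Haar transform, approximation band only.

    Each output value is the sum of a 2x2 block of the input divided by 2.
    """
    rows, cols = len(img), len(img[0])
    if rows % 2 or cols % 2:
        raise ValueError("image sides must be even")
    return [
        [(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 2.0
         for c in range(0, cols, 2)]
        for r in range(0, rows, 2)
    ]


def haar_level2_features(img):
    """Flatten the second-level approximation coefficients into a feature vector."""
    level2 = haar_approx(haar_approx(img))
    return [value for row in level2 for value in row]


def nearest_neighbour(sample, training):
    """Return the label of the training vector closest to `sample`.

    `training` is a list of (feature_vector, label) pairs; distance is
    squared Euclidean.
    """
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(training, key=lambda item: dist2(sample, item[0]))[1]


# Hypothetical toy glyphs (8x8, 1 = ink): a bar on the left and a bar on top.
LEFT_BAR = [[1, 1, 0, 0, 0, 0, 0, 0] for _ in range(8)]
TOP_BAR = [[1] * 8 if r < 2 else [0] * 8 for r in range(8)]

training = [
    (haar_level2_features(LEFT_BAR), "left"),
    (haar_level2_features(TOP_BAR), "top"),
]

# A left bar with one stray ink pixel in the bottom-right corner.
noisy = [row[:] for row in LEFT_BAR]
noisy[7][7] = 1

print(nearest_neighbour(haar_level2_features(noisy), training))  # prints "left"
```

Two levels of decomposition reduce each 8x8 glyph to a 2x2 grid of coarse ink densities, so a single stray pixel barely moves the feature vector. The same coarsening can make distinct shapes collide, which is consistent with the abstract's use of structural features at later levels to separate confused characters.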

DOI: 10.1007/3-540-45869-7_4


The document in XML format

<record>
<TEI wicri:istexFullTextTei="biblStruct">
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Machine Recognition of Printed Kannada Text</title>
<author>
<name sortKey="Vijay Kumar, B" sort="Vijay Kumar, B" uniqKey="Vijay Kumar B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
</author>
<author>
<name sortKey="Ramakrishnan, G" sort="Ramakrishnan, G" uniqKey="Ramakrishnan G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45869-7_4</idno>
<idno type="url">https://api.istex.fr/document/3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000A04</idno>
<idno type="wicri:Area/Istex/Curation">000992</idno>
<idno type="wicri:Area/Istex/Checkpoint">000F98</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Vijay Kumar B:machine:recognition:of</idno>
<idno type="wicri:Area/Main/Merge">001964</idno>
<idno type="wicri:source">INIST</idno>
<idno type="RBID">Pascal:03-0248638</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000621</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000170</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000607</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Vijay Kumar B:machine:recognition:of</idno>
<idno type="wicri:Area/Main/Merge">001A60</idno>
<idno type="wicri:Area/Main/Curation">001884</idno>
<idno type="wicri:Area/Main/Exploration">001884</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title level="a" type="main" xml:lang="en">Machine Recognition of Printed Kannada Text</title>
<author>
<name sortKey="Vijay Kumar, B" sort="Vijay Kumar, B" uniqKey="Vijay Kumar B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Inde</country>
<wicri:regionArea>Department of Electrical Engineering, Indian Institute of Science, 560012, Bangalore</wicri:regionArea>
<wicri:noRegion>Bangalore</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Inde</country>
</affiliation>
</author>
<author>
<name sortKey="Ramakrishnan, G" sort="Ramakrishnan, G" uniqKey="Ramakrishnan G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
<affiliation wicri:level="1">
<country xml:lang="fr">Inde</country>
<wicri:regionArea>Department of Electrical Engineering, Indian Institute of Science, 560012, Bangalore</wicri:regionArea>
<wicri:noRegion>Bangalore</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1">
<country wicri:rule="url">Inde</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series>
<title level="s">Lecture Notes in Computer Science</title>
<imprint>
<date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8</idno>
<idno type="DOI">10.1007/3-540-45869-7_4</idno>
<idno type="ChapterID">4</idno>
<idno type="ChapterID">Chap4</idno>
</biblStruct>
</sourceDesc>
<seriesStmt>
<idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Character recognition</term>
<term>Discrete cosine transforms</term>
<term>Discrete transformation</term>
<term>Haar function</term>
<term>Image processing</term>
<term>Image segmentation</term>
<term>Karnataka</term>
<term>Neural network</term>
<term>Optical character recognition</term>
<term>Pattern classification</term>
<term>Pattern recognition</term>
<term>Radial basis function</term>
<term>Wavelet transformation</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Classification forme</term>
<term>Fonction Haar</term>
<term>Fonction base radiale</term>
<term>Karnataka</term>
<term>Reconnaissance caractère</term>
<term>Reconnaissance forme</term>
<term>Reconnaissance optique caractère</term>
<term>Réseau neuronal</term>
<term>Segmentation image</term>
<term>Traitement image</term>
<term>Transformation cosinus discrète</term>
<term>Transformation discrète</term>
<term>Transformation ondelette</term>
</keywords>
</textClass>
<langUsage>
<language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Abstract: This paper presents the design of a full-fledged OCR system for printed Kannada text. The machine recognition of Kannada characters is difficult due to similarity in the shapes of different characters, script complexity and non-uniqueness in the representation of diacritics. The document image is subjected to line segmentation, word segmentation and zone detection. From the zonal information, base characters, vowel modifiers and consonant conjuncts are separated. A knowledge-based approach is employed for recognizing the base characters. Various features are employed for recognizing the characters. These include the coefficients of the Discrete Cosine Transform, Discrete Wavelet Transform and Karhunen-Loève Transform. These features are fed to different classifiers. Structural features are used in the subsequent levels to discriminate confused characters. The use of structural features increases the recognition rate from 93% to 98%. Apart from the classical pattern classification technique of nearest neighbour, Artificial Neural Network (ANN) based classifiers such as Backpropagation and Radial Basis Function (RBF) networks have also been studied. The ANN classifiers are trained in supervised mode using the transform features. The highest recognition rate of 99% is obtained with RBF using second-level approximation coefficients of Haar wavelets as the features on presegmented base characters.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>Inde</li>
</country>
</list>
<tree>
<country name="Inde">
<noRegion>
<name sortKey="Vijay Kumar, B" sort="Vijay Kumar, B" uniqKey="Vijay Kumar B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
</noRegion>
<name sortKey="Ramakrishnan, G" sort="Ramakrishnan, G" uniqKey="Ramakrishnan G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
<name sortKey="Ramakrishnan, G" sort="Ramakrishnan, G" uniqKey="Ramakrishnan G" first="G." last="Ramakrishnan">G. Ramakrishnan</name>
<name sortKey="Vijay Kumar, B" sort="Vijay Kumar, B" uniqKey="Vijay Kumar B" first="B." last="Vijay Kumar">B. Vijay Kumar</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001884 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001884 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     ISTEX:3CE7016F0ECDCB6C7E187ECFA957CB5BEA325CF8
   |texte=   Machine Recognition of Printed Kannada Text
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024